Using Time-Synchronous Phone Co-occurrences in a SVM-Phonotactic Dialect Recognition System
Abstract
This paper presents a simple approach to phonotactic dialect recognition which uses lattices of time-synchronous phone co-occurrences at the frame level. In previous work, we successfully applied cross-decoder phone co-occurrences to improve performance in language recognition experiments on the 2007 NIST LRE database. We define a phone co-occurrence as the simultaneous (time-synchronous) presence of two phone units coming from two different phone decoders. In this work, the approach is ported to a dialect recognition task, based on the assumption that co-occurrences can better represent the subtle differences among dialects. In addition, a slightly different approach is presented, based on the simultaneous presence of two phone units in the lattice produced by a single decoder (intra-decoder phone co-occurrences). To evaluate the approach, a selection of open software tools (Brno University of Technology phone decoders, HTK, SRILM, LIBLINEAR and FoCal) was used, and experiments were carried out on the Arabic dialects of the NIST 2011 LRE database. The approach based on cross-decoder phone co-occurrences outperformed the baseline phonotactic system, yielding around 8% relative improvement. The fusion of both systems yielded 7.31% EER and CLLR = 0.497, a 19% relative improvement.
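As a rough illustration of the definition above, the following Python sketch pairs the frame-level outputs of two phone decoders into co-occurrence units and counts bigrams over them. It is a simplification under stated assumptions, not the authors' implementation: it works on 1-best segmentations rather than lattices, it collapses runs of identical pairs into a single unit, and the function names, the segment format `(phone, start_frame, end_frame)` and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import groupby


def expand_to_frames(segments):
    """Expand (phone, start_frame, end_frame) segments into one label per frame.

    The end frame is treated as exclusive (an assumption of this sketch).
    """
    frames = []
    for phone, start, end in segments:
        frames.extend([phone] * (end - start))
    return frames


def cooccurrence_sequence(segments_a, segments_b):
    """Pair the labels of two decoders frame by frame.

    Consecutive identical pairs are collapsed into one co-occurrence unit.
    If the decodings differ in length, the extra frames are ignored.
    """
    paired = zip(expand_to_frames(segments_a), expand_to_frames(segments_b))
    return [pair for pair, _run in groupby(paired)]


def bigram_counts(units):
    """Count bigrams over a sequence of co-occurrence units."""
    return Counter(zip(units, units[1:]))


# Toy 1-best decodings of the same 6-frame utterance (hypothetical data).
decoder_a = [("a", 0, 3), ("b", 3, 6)]
decoder_b = [("x", 0, 2), ("y", 2, 6)]

units = cooccurrence_sequence(decoder_a, decoder_b)
counts = bigram_counts(units)
print(units)
print(counts)
```

In a phonotactic system, counts of this kind would typically be normalized and used as the feature vector for a linear SVM; the lattice-based variant described in the abstract would replace hard counts with expected counts weighted by posterior probabilities.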
Similar resources
On the Use of Lattices of Time-Synchronous Cross-Decoder Phone Co-Occurrences in a SVM-Phonotactic Language Recognition System
This paper presents a simple approach to phonotactic language recognition which uses lattices of time-synchronous cross-decoder phone co-occurrences at the frame level. In previous work we have successfully applied cross-decoder information, but using statistics of n-grams extracted from 1-best phone strings. In this work, the method to build and properly use lattices of cross-decoder phone co-...
Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognition
Most common approaches to phonotactic language recognition deal with several independent phone decoders. Decodings are processed and scored in a fully uncoupled way, their time alignment (and the information that may be extracted from it) being completely lost. Recently, we have presented a new approach to phonotactic language recognition which takes into account time alignment information, by ...
Dialect recognition using a phone-GMM-supervector-based SVM kernel
In this paper, we introduce a new approach to dialect recognition which relies on the hypothesis that certain phones are realized differently across dialects. Given a speaker’s utterance, we first obtain the most likely phone sequence using a phone recognizer. We then extract GMM Supervectors for each phone instance. Using these vectors, we design a kernel function that computes the similaritie...
Modeling code-Switching speech on under-resourced languages for language identification
This paper presents an integration of phonotactic information to perform language identification (LID) in mixed-language speech. A single-pass front-end recognition system is employed to convert the spoken utterances into a statistical occurrence of phone sequences. To process such phone sequences, a hidden Markov model (HMM) is utilized to build robust acoustic models that can handle multipl...
Discriminative Phonotactics for Dialect Recognition Using Context-Dependent Phone Classifiers
In this paper, we introduce a new approach to dialect recognition that relies on context-dependent (CD) phonetic differences between dialects as well as phonotactics. Given a speech utterance, we obtain the phone sequence using a CD-phone recognizer. We then identify the most likely dialect of these CD-phones using SVM classifiers. Augmenting these phones with the output of these classifiers, we...